Abstract

The collision problem is to decide whether a function $X : \{1,\ldots,n\} \to \{1,\ldots,n\}$ is one-to-one or
two-to-one, given that one of these is the case. We show a lower bound of $\Omega(n^{1/5})$ on the number of
queries needed by a quantum computer to solve this problem with bounded error probability. The best
known upper bound is $O(n^{1/3})$, but obtaining any lower bound better than $\Omega(1)$ was an open problem
since 1997. Our proof uses the polynomial method augmented by some new ideas. We also give a
lower bound of $\Omega(n^{1/7})$ for the problem of deciding whether two sets are equal or disjoint on a constant
fraction of elements. Finally we give implications of these results for quantum complexity theory.

1 Introduction 

The power of quantum computing has been intensively studied for a decade. Apart
from possible applications, such as speeding up combinatorial search and breaking public-key
cryptography, a major motivation for this work has been to better understand quantum theory itself. Thus,
researchers have tried to discover not just the capabilities of quantum computing but also the limitations.
This task is difficult, though; proving (for example) that quantum computers cannot solve NP-complete
problems in polynomial time would imply $P \neq NP$.

A popular alternative is to study restricted models of computation, and particularly the query model, in
which one counts only the number of queries to the input, not the number of computational steps. An early
result of Bennett, Bernstein, Brassard, and Vazirani showed that a quantum computer needs $\Omega(\sqrt{n})$
queries to search a list of $n$ items for one marked item. (This bound is tight, as evidenced by Grover's
algorithm.) Subsequently, Beals et al., Ambainis, and others obtained lower bounds for many
other problems.

But one problem, the collision problem, resisted attempts to prove a lower bound. Because of
its simplicity, the problem was widely considered a benchmark for our understanding of quantum query
complexity. The collision problem of size $n$, or $\mathrm{Col}_n$, is defined as follows. Let $X = x_1 \ldots x_n$ be a sequence
of $n$ integers drawn from $\{1,\ldots,n\}$, with $n$ even. We are guaranteed that either

(1) X is one-to-one (that is, a permutation of {1, . . . , n}), or 



*Computer Science Department, University of California, Berkeley, CA 94720-1776. Email: aaronson@cs.berkeley.edu.
Supported in part by a National Science Foundation Graduate Fellowship and by the Institute for Quantum Information at the 
California Institute of Technology. 






(2) X is two-to-one (that is, each element of {1, . . . ,n} appears in X twice or not at all). 



The problem is to decide whether (1) or (2) holds. 
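To make the promise concrete, the following Python sketch generates the two kinds of input and decides between them classically by reading every entry. It is an illustration only: the function names are hypothetical, and the classical decider says nothing about quantum query complexity.

```python
import random


def one_to_one_input(n, rng):
    """Return a random permutation of {1, ..., n} as a list."""
    xs = list(range(1, n + 1))
    rng.shuffle(xs)
    return xs


def two_to_one_input(n, rng):
    """Return a random sequence in which n/2 distinct values each appear twice."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    values = rng.sample(range(1, n + 1), n // 2)
    xs = values + values
    rng.shuffle(xs)
    return xs


def classify(xs):
    """Classical reference decision that reads all n entries."""
    return "one-to-one" if len(set(xs)) == len(xs) else "two-to-one"
```

For example, `classify(two_to_one_input(8, random.Random(0)))` returns `"two-to-one"`.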

We show that $Q_2(\mathrm{Col}_n) = \Omega(n^{1/5})$, where $Q_2$ is bounded-error quantum query complexity as defined
by Beals et al. Details of the oracle model are given in Section 3. The best known upper bound, due
to Brassard, Høyer, and Tapp, is $O(n^{1/3})$; thus, our bound is probably not tight. Previously, though,
no lower bound better than the trivial $\Omega(1)$ bound was known. How great a speedup quantum computers
yield for the problem was apparently first asked by Rains.

Previous lower bound techniques failed for the problem because they depended on a function's being
sensitive to many disjoint changes to the input. For example, Beals et al. showed that for all total
Boolean functions $f$, $Q_2(f) = \Omega(\sqrt{\mathrm{bs}(f)})$, where $\mathrm{bs}(f)$ is the block sensitivity, defined by Nisan to
be, informally, the maximum number of disjoint changes (to any particular input $X$) to which $f$ is sensitive.
In the case of the collision problem, though, every one-to-one input differs from every two-to-one input in at
least $n/2$ places, so the block sensitivity is $O(1)$. Ambainis' adversary method, as currently formulated,
faces a related obstacle. In that method we consider the algorithm and input as a bipartite quantum state,
and upper-bound how much the entanglement of the state can increase via a single query. Yet under the
simplest measures of entanglement, the algorithm and input can become highly entangled after $O(1)$ queries,
again because every one-to-one input is far from every two-to-one input.

Our proof is an adaptation of the polynomial method, introduced to quantum computing by Beals et al.
Their idea was to reduce questions about quantum algorithms to easier questions about multivariate
polynomials. In particular, if a quantum algorithm makes $T$ queries, then its acceptance probability is a
polynomial over the input bits of degree at most $2T$. So by showing that any polynomial approximating
the desired output has high degree, one obtains a lower bound on $T$.

To lower-bound the degree of a multivariate polynomial, a key technical trick is to construct a related
univariate polynomial. Beals et al., using a lemma due to Minsky and Papert, replace a polynomial
$p(X)$ (where $X$ is a bit string) by $q(|X|)$ (where $|X|$ denotes the Hamming weight of $X$), satisfying

$$q(k) = \operatorname{EX}_{|X| = k}\, p(X)$$

and $\deg(q) \le \deg(p)$.
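The symmetrization step can be checked by brute force on small cases. The sketch below is an illustration rather than the construction used in the proof: it averages a polynomial $p$ over all bit strings of each Hamming weight and recovers the degree of the resulting univariate $q$ from finite differences. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def symmetrize(p, n):
    """Return [q(0), ..., q(n)], where q(k) averages p over n-bit strings of weight k."""
    values = []
    for k in range(n + 1):
        total, count = Fraction(0), 0
        for ones in combinations(range(n), k):
            x = [0] * n
            for i in ones:
                x[i] = 1
            total += p(x)
            count += 1
        values.append(total / count)
    return values


def degree_from_values(values):
    """Degree of the polynomial interpolating (k, values[k]), via finite differences."""
    diffs = list(values)
    degree, order = -1, 0
    while any(d != 0 for d in diffs):
        degree = order
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        order += 1
    return degree
```

For $p(X) = x_1 x_2$ on $n = 3$ bits this gives $q(k) = k(k-1)/6$, which has degree 2, the same as $p$.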

We construct the univariate polynomial in a different way. We consider a uniform distribution over
$k$-to-one inputs, where $k$ might be greater than 2. Even though the problem is to distinguish $k = 1$ from
$k = 2$, the acceptance probability must lie between 0 and 1 for all $k$, and that is a surprisingly strong
constraint. We show that the acceptance probability is close to a univariate polynomial in $k$ of degree at
most $2T$. We then obtain a lower bound by generalizing a classical approximation theory result of Ehlich
and Zeller [11] and Rivlin and Cheney [19]. Much of the proof deals with the complication that $k$ does not
divide $n$ in general.

Shi has recently improved our method to obtain a stronger lower bound for the collision problem.
The paper is organized as follows. Section 2 motivates the collision lower bound within quantum
computing, pointing out connections to collision-resistant hash functions, the nonabelian hidden subgroup
problem, and information erasure. Section 3 gives technical preliminaries. Section 4 proves the
fact that the acceptance probability is "almost" a univariate polynomial, and Section 5 completes the lower
bound argument. In the Appendix we show a lower bound of $\Omega(n^{1/7})$ for the set comparison problem, a
variant of the collision problem that is needed for the application to information erasure.
2 Motivation

The most immediate implication of the collision lower bound is that certain problems, notably breaking
cryptographic hash functions, are not in BQP relative to an oracle. A second implication is that a
nonstandard quantum oracle model proposed by Kashefi et al. is exponentially more powerful than the usual
oracle model. A third implication, in our view the most interesting one, concerns the computational power
of so-called dynamical quantum theories. That implication will be discussed in detail in another paper.

2.1 Oracle Hardness Results 

The original motivation for the collision problem was to model (strongly) collision-resistant hash functions
in cryptography. There is a large literature on collision-resistant hashing. When
building secure digital signature schemes, it is useful to have a family of hash functions $\{H_i\}$, such that
finding a distinct $(x,y)$ pair with $H_i(x) = H_i(y)$ is computationally intractable. A quantum algorithm
for finding collisions using $O(\mathrm{polylog}(n))$ queries would render all hash functions insecure against quantum
attack in this sense. (Shor's algorithm already renders hash functions based on modular arithmetic
insecure.) Our result indicates that collision-resistant hashing might still be possible in a quantum setting.

The collision problem also models the nonabelian hidden subgroup problem, of which graph isomorphism
is a special case. Given a group $G$ and subgroup $H \le G$, suppose we have oracle access to a function
$f : G \to \mathbb{N}$ such that for all $g_1, g_2 \in G$, $f(g_1) = f(g_2)$ if and only if $g_1$ and $g_2$ belong to the same coset of
$H$. Is there then an efficient quantum algorithm to determine $H$? If $G$ is abelian, the work of Simon
and Shor implies an affirmative answer. If $G$ is nonabelian, though, efficient quantum algorithms are
known only for special cases. An $O(\mathrm{polylog}(n))$-query algorithm for the collision problem would
yield a polynomial-time algorithm to distinguish $|H| = 1$ from $|H| = 2$, which does not exploit the group
structure at all. Our result implies that no such algorithm exists.

2.2 Information Erasure 

Let $f : \{0,1\}^n \to \{0,1\}^m$ with $m \ge n$ be a one-to-one function. Then we can consider two kinds of quantum
oracle for $f$:

(A) a standard oracle, one that maps $|x\rangle |z\rangle$ to $|x\rangle |z \oplus f(x)\rangle$, or

(B) an erasing oracle (as recently proposed by Kashefi et al.), which maps $|x\rangle$ to $|f(x)\rangle$, in effect
"erasing" $|x\rangle$.

Intuitively erasing oracles seem at least as strong as standard ones, though it is not clear how to simulate
the latter with the former without also having access to an oracle that maps $|y\rangle$ to $|f^{-1}(y)\rangle$. The question
that concerns us here is whether erasing oracles are more useful than standard ones for some problems.
One-way functions provide a clue: if $f$ is one-way, then (by assumption) $|x\rangle |f(x)\rangle$ can be computed efficiently,
but if $|f(x)\rangle$ could be computed efficiently given $|x\rangle$ then so could $|x\rangle$ given $|f(x)\rangle$, and hence $f$ could be
inverted. But can we find, for some problem, an exponential gap between query complexity given a standard
oracle and query complexity given an erasing oracle?

In the Appendix we extend the collision lower bound to show an affirmative answer. Define the set
comparison problem of size $n$, or $\mathrm{SetComp}_n$, as follows. We are given as input two sequences, $X = x_1 \ldots x_n$
and $Y = y_1 \ldots y_n$, such that for each $i$, $x_i, y_i \in \{1,\ldots,2n\}$. A query has the form $(b,i)$, where $b \in \{0,1\}$ and
$i \in \{1,\ldots,n\}$, and produces as output $(0,x_i)$ if $b = 0$ and $(1,y_i)$ if $b = 1$. Sequences $X$ and $Y$ are both
one-to-one; that is, $x_i \ne x_j$ and $y_i \ne y_j$ for all $i \ne j$. We are furthermore guaranteed that either

(1) $X$ and $Y$ are equal as sets (that is, $\{x_1,\ldots,x_n\} = \{y_1,\ldots,y_n\}$) or

(2) $X$ and $Y$ are far as sets (that is, $|\{x_1,\ldots,x_n\} \cup \{y_1,\ldots,y_n\}| \ge 1.1n$).

As before the problem is to decide whether (1) or (2) holds.

This problem can be solved with high probability in a constant number of queries using an erasing oracle,
by using a trick similar to that of Watrous for verifying group non-membership. First, using the oracle,
we prepare the uniform superposition

$$\frac{1}{\sqrt{2n}} \sum_{i \in \{1,\ldots,n\}} \left( |0\rangle |x_i\rangle + |1\rangle |y_i\rangle \right).$$

We then apply a Hadamard gate to the first register, and finally we measure the first register. If $X$ and $Y$ are
equal as sets, then interference occurs between every $(|0\rangle |z\rangle, |1\rangle |z\rangle)$ pair and we observe $|0\rangle$ with certainty.
But if $X$ and $Y$ are far as sets, then basis states $|b\rangle |z\rangle$ with no matching $|1-b\rangle |z\rangle$ have probability weight
at least 1/10, and hence we observe $|1\rangle$ with probability at least 1/20.
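The measurement statistics of this test can be computed directly, without simulating amplitudes explicitly. The sketch below is an illustration with a hypothetical function name. It assumes the state is exactly the superposition described above; after the Hadamard gate the amplitude on $|1\rangle |z\rangle$ is $(c_X(z) - c_Y(z)) / (2\sqrt{n})$, where $c_X(z)$ and $c_Y(z)$ count the occurrences of $z$ in $X$ and $Y$.

```python
from collections import Counter
from fractions import Fraction


def prob_observe_one(xs, ys):
    """Probability of measuring 1 in the first register after the Hadamard gate.

    Assumes the pre-measurement state is
    (1/sqrt(2n)) * sum_i (|0>|x_i> + |1>|y_i>) with len(xs) == len(ys) == n.
    """
    n = len(xs)
    cx, cy = Counter(xs), Counter(ys)
    total = sum((cx[z] - cy[z]) ** 2 for z in set(cx) | set(cy))
    return Fraction(total, 4 * n)
```

With `xs = [1, ..., 10]` and `ys = [2, ..., 11]` the union has size $11 = 1.1n$ and the function returns exactly 1/20, matching the bound stated above; for equal sets it returns 0.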

In the Appendix we show that $Q_2(\mathrm{SetComp}_n) = \Omega(n^{1/7})$; that is, no efficient quantum algorithm using a
standard oracle exists for this problem.



3 Preliminaries 

Let $A$ be a quantum query algorithm. A basis state of $A$ is written $|\Psi, i, z\rangle$. Then a query replaces each
$|\Psi, i, z\rangle$ by $|\Psi \oplus x_i, i, z\rangle$, where $x_i$ is exclusive-OR'ed into some specified location of $\Psi$ (which we cannot
assume to be all 0's). We assume without loss of generality that every basis state queries at every step.
Between queries, the algorithm can perform any unitary operation that does not depend on the input. At
the end $z$ is measured in the standard basis; if $z = 1$ the algorithm returns 'one-to-one' and if $z = 2$ it returns
'two-to-one.' The total number of queries is denoted $T$. Also, we assume for simplicity that all amplitudes
are real; this restriction is without loss of generality.

Let $\alpha^{(t)}_{X,\Psi,i,z}$ be the amplitude of basis state $|\Psi, i, z\rangle$ after $t$ queries when the input is $X$. Also, let
$\Delta(x_i, h) = 1$ if $x_i = h$, and $\Delta(x_i, h) = 0$ if $x_i \ne h$. Let $P(X)$ be the probability that $A$ returns $z = 2$ when
the input is $X$. Then we obtain a simple variant of the main lemma of Beals et al.

Lemma 1 $P(X)$ is a multilinear polynomial of degree at most $2T$ over the $\Delta(x_i, h)$.

Proof. We show, by induction on $t$, that for all basis states $|\Psi, i, z\rangle$, $\alpha^{(t)}_{X,\Psi,i,z}$ is a multilinear polynomial
of degree at most $t$ over the $\Delta(x_i, h)$. Since $P(X)$ is a sum of squares of $\alpha^{(T)}_{X,\Psi,i,z}$, the lemma follows.

The base case ($t = 0$) holds since, before making any queries, each $\alpha^{(0)}_{X,\Psi,i,z}$ is a degree-0 polynomial over
the $\Delta(x_i, h)$. A unitary transformation on the algorithm part replaces each $\alpha^{(t)}_{X,\Psi,i,z}$ by a linear combination
of $\alpha^{(t)}_{X,\Psi,i,z}$, and hence cannot increase the degree. Suppose the lemma holds prior to the $t$-th query. Then
the query multiplies each amplitude by a single $\Delta(x_i, h)$ variable and sums over $h$,

$$\alpha^{(t)}_{X,\Psi,i,z} = \sum_{1 \le h \le n} \alpha^{(t-1)}_{X,\Psi \oplus h,i,z}\, \Delta(x_i, h),$$

so the degree increases by at most one, and we are done. ■

A remark on notation: we sometimes use brackets ($a_{b[c]}$) rather than nested subscripts ($a_{b_c}$).
4 Reduction to Bivariate Polynomial

Call the point $(g,N) \in \Re^2$ an $(n,T)$-quasilattice point if and only if

(1) $g$ and $N$ are integers, with $g$ dividing $N$,

(2) $1 \le g \le \sqrt{n}$,

(3) $n \le N \le n + n/(10T)$, and

(4) if $g = 1$ then $N = n$.
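The four conditions translate directly into a predicate. The Python sketch below is illustrative; the function name is hypothetical, and the inequalities in (2) and (3) are taken to be non-strict and checked in integer arithmetic to avoid rounding.

```python
def is_quasilattice_point(g, N, n, T):
    """Check conditions (1)-(4) for (g, N) to be an (n, T)-quasilattice point."""
    if not (isinstance(g, int) and isinstance(N, int)):
        return False
    if g < 1 or N % g != 0:
        return False  # (1): g must divide N
    if g * g > n:
        return False  # (2): 1 <= g <= sqrt(n)
    if not (n <= N and 10 * T * N <= 10 * T * n + n):
        return False  # (3): n <= N <= n + n/(10T)
    if g == 1 and N != n:
        return False  # (4): g = 1 forces N = n
    return True
```

For $n = 100$ and $T = 1$, the point $(2, 102)$ qualifies, while $(1, 101)$ fails condition (4) and $(5, 115)$ fails condition (3).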

For quasilattice point $(g,N)$, define $\mathcal{D}_n(g,N)$ to be the uniform distribution over all size-$n$ subfunctions
of $g$-1 functions having domain $\{1,\ldots,N\}$ and range a subset of $\{1,\ldots,n\}$. More precisely: to draw an
$X$ from $\mathcal{D}_n(g,N)$, we first choose a set $S \subseteq \{1,\ldots,n\}$ with $|S| = N/g \le n$ uniformly at random. We then
choose a $g$-1 function $\hat{X} = \hat{x}_1 \ldots \hat{x}_N$ from $\{1,\ldots,N\}$ to $S$ uniformly at random. Finally we let $x_i = \hat{x}_i$ for
each $1 \le i \le n$.
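The three-step sampling procedure can be sketched as follows. This is an illustration with a hypothetical function name; it assumes $(g,N)$ is a quasilattice point and uses the fact that a uniformly random arrangement of a multiset in which each element of $S$ appears $g$ times is a uniformly random $g$-1 function onto $S$.

```python
import random
from collections import Counter


def sample_from_D(n, g, N, rng):
    """Draw X = x_1 ... x_n from D_n(g, N), assuming (g, N) is a quasilattice point."""
    if N % g != 0 or N < n or N // g > n:
        raise ValueError("need g | N, N >= n and N/g <= n")
    S = rng.sample(range(1, n + 1), N // g)    # random range S of size N/g
    x_hat = [s for s in S for _ in range(g)]   # each value of S exactly g times
    rng.shuffle(x_hat)                         # uniform g-to-1 function on {1..N}
    return x_hat[:n]                           # restrict to the first n inputs
```

With $g = 1$ and $N = n$ the output is a random permutation of $\{1,\ldots,n\}$; for larger $g$ each value appears at most $g$ times among the $n$ retained entries.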

Let $P(g,N)$ be the probability that algorithm $A$ returns $z = 2$ when the input is chosen from $\mathcal{D}_n(g,N)$:

$$P(g,N) = \operatorname{EX}_{X \in \mathcal{D}_n(g,N)}\, P(X).$$

We then have the following surprising characterization:

Lemma 2 For all sufficiently large $n$ and if $T \le \sqrt{n}/3$, there exists a bivariate polynomial $q(g,N)$ of degree
at most $2T$ such that if $(g,N)$ is a quasilattice point, then

$$|P(g,N) - q(g,N)| \le 0.182$$

(where the constant 0.182 can be made arbitrarily small by adjusting parameters).

Proof. Let $I$ be a product of $\Delta(x_i, h)$ variables, with degree $r(I)$, and let $I(X) \in \{0,1\}$ be $I$ evaluated
on input $X$. Then define

$$\gamma(I,g,N) = \operatorname{EX}_{X \in \mathcal{D}_n(g,N)}\, I(X)$$

to be the probability that monomial $I$ evaluates to 1 when the input is drawn from $\mathcal{D}_n(g,N)$. Then by
Lemma 1, $P(X)$ is a polynomial of degree at most $2T$ over $X$, so

$$P(g,N) = \operatorname{EX}_{X \in \mathcal{D}_n(g,N)}\, P(X) = \operatorname{EX}_{X \in \mathcal{D}_n(g,N)} \sum_{I : r(I) \le 2T} \beta_I I(X) = \sum_{I : r(I) \le 2T} \beta_I\, \gamma(I,g,N)$$

for some coefficients $\beta_I$.

We now calculate $\gamma(I,g,N)$. Assume without loss of generality that for all $\Delta(x_i, h_1), \Delta(x_j, h_2) \in I$,
either $i \ne j$ or $h_1 = h_2$, since otherwise $\gamma(I,g,N) = 0$.

Define the "range" $Z(I)$ of $I$ to be the set of all $h$ such that $\Delta(x_i, h) \in I$. Let $w(I) = |Z(I)|$; then
we write $Z(I) = \{z_1, \ldots, z_{w(I)}\}$. Clearly $\gamma(I,g,N) = 0$ unless $Z(I) \subseteq S$, where $S$ is the range of $\hat{X}$. By
assumption,

$$\frac{N}{g} \ge \frac{n}{\sqrt{n}} > 2T \ge r(I)$$
so the number of possible $S$ is

$$\binom{n}{N/g}$$

and, of these, the number that contain $Z(I)$ is

$$\binom{n - w(I)}{N/g - w(I)}.$$

Then, conditioned on $Z(I) \subseteq S$, what is the probability that $I(X) = 1$? The total number of $g$-1
functions with domain size $N$ is $N!/(g!)^{N/g}$, since we can permute the $N$ function values arbitrarily, but
must not count permutations that act only within the $N/g$ constant-value blocks of size $g$.
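The count $N!/(g!)^{N/g}$ is easy to confirm by exhaustive enumeration for small parameters. The sketch below is a sanity check only, with hypothetical helper names; it fixes the range $S = \{0, \ldots, N/g - 1\}$ and counts the functions that hit every value exactly $g$ times.

```python
from itertools import product
from math import factorial


def count_g_to_one(N, g):
    """Brute-force count of functions {1..N} -> S, |S| = N/g, hitting each value g times."""
    k = N // g
    count = 0
    for f in product(range(k), repeat=N):
        if all(f.count(v) == g for v in range(k)):
            count += 1
    return count


def closed_form(N, g):
    """The formula N! / (g!)^(N/g) from the text."""
    return factorial(N) // factorial(g) ** (N // g)
```

For instance, both functions give 6 for $(N, g) = (4, 2)$ and 90 for $(N, g) = (6, 2)$.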

Among these functions, how many satisfy $I(X) = 1$? Suppose that, for each $1 \le j \le w(I)$, there
are $r_j(I)$ distinct $i$ such that $\Delta(x_i, z_j) \in I$. Clearly

$$r_1(I) + \cdots + r_{w(I)}(I) = r(I).$$

Then we can permute the $(N - r(I))!$ function values outside of $I$ arbitrarily, but must not count
permutations that act only within the $N/g$ constant-value blocks, which have size either $g$ or $g - r_i(I)$ for some $i$.
So the number of functions for which $I(X) = 1$ is

$$\frac{(N - r(I))!}{(g!)^{N/g - w(I)} \prod_{i=1}^{w(I)} (g - r_i(I))!}.$$



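Putting it all together,

$$\begin{aligned}
\gamma(I,g,N) &= \frac{\binom{n - w(I)}{N/g - w(I)}}{\binom{n}{N/g}} \cdot \frac{(N - r(I))!\,(g!)^{N/g}}{(g!)^{N/g - w(I)}\, N!\, \prod_{i=1}^{w(I)} (g - r_i(I))!} \\
&= \frac{(N - r(I))!\,(n - w(I))!\,(N/g)!}{N!\, n!\, (N/g - w(I))!} \prod_{i=1}^{w(I)} \frac{g!}{(g - r_i(I))!} \\
&= \frac{(N - r(I))!}{N!} \cdot \frac{(n - w(I))!}{n!} \prod_{i=0}^{w(I)-1} \left( \frac{N}{g} - i \right) \prod_{i=1}^{w(I)} \left[ g \prod_{j=1}^{r_i(I)-1} (g - j) \right] \\
&= \frac{(N - 2T)!\, n!}{N!\, (n - 2T)!} \cdot \frac{(n - w(I))!\,(n - 2T)!}{(n!)^2} \prod_{i=r(I)}^{2T-1} (N - i) \prod_{i=0}^{w(I)-1} (N - gi) \prod_{i=1}^{w(I)} \prod_{j=1}^{r_i(I)-1} (g - j) \\
&= \frac{(N - 2T)!\, n!}{N!\, (n - 2T)!}\, \widetilde{q}_{n,T,I}(g,N)
\end{aligned}$$

where

$$\widetilde{q}_{n,T,I}(g,N) = \frac{(n - w(I))!\,(n - 2T)!}{(n!)^2} \prod_{i=r(I)}^{2T-1} (N - i) \prod_{i=0}^{w(I)-1} (N - gi) \prod_{i=1}^{w(I)} \prod_{j=1}^{r_i(I)-1} (g - j)$$

is a bivariate polynomial of total degree at most

$$(2T - r(I)) + w(I) + (r(I) - w(I)) = 2T.$$

(Note that in the case $r_i(I) > g$ for some $i$, this polynomial evaluates to 0, which is what it ought to do.)
Hence

$$P(g,N) = \sum_{I : r(I) \le 2T} \beta_I\, \gamma(I,g,N) = \frac{(N - 2T)!\, n!}{N!\, (n - 2T)!}\, q(g,N)$$

where

$$q(g,N) = \sum_{I : r(I) \le 2T} \beta_I\, \widetilde{q}_{n,T,I}(g,N).$$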



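Clearly

$$\frac{(N - 2T)!\, n!}{N!\, (n - 2T)!} \le 1.$$

Since $N \le n + n/(10T)$ and $T \le \sqrt{n}/3$, we also have

$$\begin{aligned}
\frac{(N - 2T)!\, n!}{N!\, (n - 2T)!} &\ge \left( \frac{n - 2T + 1}{N - 2T + 1} \right)^{2T} \\
&\ge \left( 1 - \frac{n/(10T)}{N - 2T + 1} \right)^{2T} \\
&\ge \exp\left\{ -\frac{1}{5} \cdot \frac{n}{n - (2T + 1)/n} \right\} \\
&\ge 0.818
\end{aligned}$$

for all sufficiently large $n$.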
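The size of this prefactor can be checked numerically for particular parameters. The sketch below is an illustration only (a spot check, not a proof, with a hypothetical function name); it evaluates the ratio as the product $\prod_{i=0}^{2T-1} (n-i)/(N-i)$, which equals $(N-2T)!\,n! / (N!\,(n-2T)!)$.

```python
def factorial_ratio(n, N, T):
    """Evaluate (N-2T)! n! / (N! (n-2T)!) as a product of 2T factors (n-i)/(N-i)."""
    ratio = 1.0
    for i in range(2 * T):
        ratio *= (n - i) / (N - i)
    return ratio
```

For example, with $n = 10^6$, $T = 100$ and the largest allowed $N = n + n/(10T)$, the ratio lies between 0.818 and 1; with $N = n$ it equals 1.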

Thus, since $0 \le P(g,N) \le 1$,

$$|P(g,N) - q(g,N)| \le 0.182$$

and we are done. ■



5 Lower Bound 

We are now ready to prove a lower bound for the collision problem. To do so, we generalize an approximation
theory result due to Rivlin and Cheney [19] and (independently) Ehlich and Zeller [11]. That result was
applied to query complexity by Nisan and Szegedy and later by Beals et al.

Theorem 3 $Q_2(\mathrm{Col}_n) = \Omega(n^{1/5})$.

Proof. Let $g$ have range $1 \le g \le G$. Then the quasilattice points $(g,N)$ all lie in the rectangular region
$R = [1,G] \times [n, n + n/(10T)]$. Recalling the polynomial $q(g,N)$ from Lemma 2, define

$$d(q) = \max_{(g,N) \in R} \left( \max \left\{ \left| \frac{\partial q}{\partial g} \right|,\ \frac{n}{10T(G-1)} \cdot \left| \frac{\partial q}{\partial N} \right| \right\} \right).$$

Suppose without loss of generality that we require $P(1,n) \le 1/10$ and $P(2,n) \ge 9/10$ (that is, algorithm $A$
distinguishes 1-1 from 2-1 functions with error probability at most 1/10). Then, since

$$|P(g,N) - q(g,N)| \le 0.182$$

at quasilattice points, we have $q(1,n) \le 0.1 + 0.182$ and $q(2,n) \ge 0.9 - 0.182$, so by the Mean Value Theorem

$$d(q) \ge \max_{1 \le g \le 2} \frac{\partial q}{\partial g} \ge 0.8 - 2(0.182) = 0.436.$$



An inequality due to Markov states that, for a univariate polynomial $p$, if $b_1 \le p(x) \le b_2$
for all $a_1 \le x \le a_2$, then

$$\max_{a_1 \le x \le a_2} \left| \frac{dp(x)}{dx} \right| \le \frac{b_2 - b_1}{a_2 - a_1} \deg(p)^2.$$
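Chebyshev polynomials give a standard example in which Markov's inequality holds with equality: $T_d$ stays in $[-1,1]$ on $[-1,1]$, and its derivative reaches $d^2$ at the endpoints. The sketch below checks this numerically on a grid. It is an illustration only; the helper names are hypothetical and a grid search is not a proof.

```python
def chebyshev_coeffs(d):
    """Coefficients (lowest degree first) of the Chebyshev polynomial T_d."""
    prev, cur = [1], [0, 1]  # T_0 = 1, T_1 = x
    if d == 0:
        return prev
    for _ in range(d - 1):
        # Recurrence: T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x)
        nxt = [0] + [2 * c for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= c
        prev, cur = cur, nxt
    return cur


def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def max_abs_on_grid(coeffs, a, b, steps=2000):
    """Maximum absolute value on an evenly spaced grid over [a, b]."""
    return max(abs(evaluate(coeffs, a + (b - a) * k / steps)) for k in range(steps + 1))
```

For $d = 5$, `max_abs_on_grid(derivative(chebyshev_coeffs(5)), -1.0, 1.0)` stays within the Markov bound $\frac{1-(-1)}{1-(-1)} \cdot 5^2 = 25$, which is attained at $x = 1$.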



Clearly for every point $(\hat{g}, \hat{N}) \in R$, there exists a quasilattice point $(g,N)$ for which $|g - \hat{g}| \le 1$ and
$|N - \hat{N}| \le G$. For take $g = \lceil \hat{g} \rceil$ or, in the special case $\hat{g} = 1$, take $g = 2$, since there is only one
quasilattice point with $g = 1$.

Furthermore, since $P(g,N)$ represents an acceptance probability at such a point, we have

$$-0.182 < q(g,N) < 1.182.$$

Observe that for all $(\hat{g}, \hat{N}) \in R$,

$$-0.182 - \left( \frac{10TG(G-1)}{n} + 1 \right) d(q) < q(\hat{g}, \hat{N}) < 1.182 + \left( \frac{10TG(G-1)}{n} + 1 \right) d(q).$$

For consider a quasilattice point close to $(\hat{g}, \hat{N})$, and note that the maximum-magnitude derivative is at
most $d(q)$ in the $g$ direction and $10T(G-1)\,d(q)/n$ in the $N$ direction.

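Let $(g^*, N^*)$ be a point in $R$ at which the weighted maximum-magnitude derivative $d(q)$ is attained.
Suppose first that the maximum is attained in the $g$ direction. Then $q(g, N^*)$ (with $N^*$ constant) is a
univariate polynomial with

$$\left| \frac{\partial q(g, N^*)}{\partial g} \right| \ge 0.436$$

for some $1 \le g \le G$. So

$$\begin{aligned}
2T &\ge \deg(q(g,N)) \\
&\ge \deg(q(g,N^*)) \\
&\ge \sqrt{ \frac{d(q)\,(G-1)}{1.364 + 2d(q)\,(1 + 10TG(G-1)/n)} } \\
&\ge \sqrt{ \frac{0.436\,(G-1)\,n}{2.236\,n + 8.720\,TG(G-1)} } \\
&= \Omega\left( \min\left\{ \sqrt{G},\ \sqrt{\frac{n}{TG}} \right\} \right).
\end{aligned}$$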
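Similarly, suppose the maximum $d(q)$ is attained in the $N$ direction. Then $q(g^*, N)$ (with $g^*$ constant)
is a univariate polynomial with

$$\left| \frac{\partial q(g^*, N)}{\partial N} \right| \ge \frac{0.436\,T(G-1)}{n}$$

for some $n \le N \le n + n/(10T)$. So

$$2T \ge \sqrt{ \frac{(10T(G-1)/n)\, d(q) \cdot n/(10T)}{1.364 + 2d(q)\,(1 + 10TG(G-1)/n)} } = \Omega\left( \min\left\{ \sqrt{G},\ \sqrt{\frac{n}{TG}} \right\} \right).$$

One can show that the lower bound on $T$ is optimized when we take $G = n^{2/5} \le \sqrt{n}$. Then

$$T = \Omega\left( \min\left\{ n^{1/5},\ \frac{\sqrt{n}}{\sqrt{T}\, n^{1/5}} \right\} \right), \qquad T = \Omega(n^{1/5})$$

and we are done. ■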
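The choice $G = n^{2/5}$ can be motivated by balancing the two terms of the bound $T = \Omega(\min\{\sqrt{G}, \sqrt{n/(TG)}\})$. The following worked derivation is a sketch that ignores constant factors; the symbol $\gtrsim$ is used informally for "at least a constant times".

```latex
% Sketch: balance the two terms of T = \Omega(\min\{\sqrt{G}, \sqrt{n/(TG)}\}).
\begin{align*}
T \gtrsim \sqrt{\frac{n}{TG}}
  &\;\Longleftrightarrow\; T^{3/2} \gtrsim \sqrt{\frac{n}{G}}
  \;\Longleftrightarrow\; T \gtrsim \left(\frac{n}{G}\right)^{1/3}, \\
\text{so}\quad T
  &= \Omega\!\left(\min\left\{\sqrt{G},\ \left(\frac{n}{G}\right)^{1/3}\right\}\right). \\
\sqrt{G} = \left(\frac{n}{G}\right)^{1/3}
  &\;\Longleftrightarrow\; G^{5/6} = n^{1/3}
  \;\Longleftrightarrow\; G = n^{2/5}, \\
G = n^{2/5}
  &\;\Longrightarrow\; \sqrt{G} = \left(\frac{n}{G}\right)^{1/3} = n^{1/5}.
\end{align*}
```

The first term increases with $G$ and the second decreases, so the minimum is largest where they meet, which gives $T = \Omega(n^{1/5})$.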
7 Appendix: Set Comparison

Here we show that $Q_2(\mathrm{SetComp}_n) = \Omega(n^{1/7})$, where $\mathrm{SetComp}_n$ is the set comparison problem of size $n$
as defined in Section 2.2. We need only redo the proof of Lemma 2; then Theorem 3 goes through largely
unchanged.

The idea is the following. We need a distribution of inputs with a parameter $g$, such that the inputs are
one-to-one when $g = 1$ or $g = 2$, since otherwise the problem of distinguishing $g = 1$ from $g = 2$ would be
ill-defined for erasing oracles. On the other hand, the inputs must not be one-to-one for all $g > 2$, since
otherwise the lower bound for standard oracles would apply also to erasing oracles, and we could not obtain
a separation between the two. Finally, the algorithm's acceptance probability must be close to a polynomial
in $g$.

Our solution is to consider $\kappa(g)$-to-one inputs, where

$$\kappa(g) = 4g^2 - 12g + 9$$

is a quadratic with $\kappa(1) = \kappa(2) = 1$. The total range of the inputs (on sequences $X$ and $Y$ combined) has
size roughly $n/g$; thus, we can tell the $g = 1$ inputs apart from the $g = 2$ inputs using an erasing oracle,
even though $\kappa(g)$ is the same for both. The disadvantage is that, because $\kappa(g)$ increases quadratically
rather than linearly in $g$, the quasilattice points become sparse more quickly. That is what weakens the
lower bound from $\Omega(n^{1/5})$ to $\Omega(n^{1/7})$. We note that, using the ideas of Shi, one can improve our lower
bound on $Q_2(\mathrm{SetComp}_n)$.

Call $(g,N,M) \in \Re^3$ an $(n,T)$-super-quasilattice point if and only if

(1) $g$ is an integer in $[1, n^{1/3}]$,

(2) $N$ and $M$ are integers in $[n, n(1 + 1/(100T))]$,

(3) $g$ divides $N$,

(4) if $g = 1$ then $N = n$,

(5) $\kappa(g)$ divides $M$, and

(6) if $g = 2$ then $M = n$.

For super-quasilattice point $(g,N,M)$, we draw input $(X,Y) = (x_1 \ldots x_n, y_1 \ldots y_n)$ from distribution
$\mathcal{L}_n(g,N,M)$ as follows. We first choose a set $S \subseteq \{1,\ldots,2n\}$ with $|S| = 2N/g \le 2n$ uniformly at
random. We then choose two sets $S_X, S_Y \subseteq S$ with $|S_X| = |S_Y| = M/\kappa(g) \le |S|$, uniformly at random
and independently. Next we choose $\kappa(g)$-1 functions $\hat{X} = \hat{x}_1 \ldots \hat{x}_M : \{1,\ldots,M\} \to S_X$ and $\hat{Y} = \hat{y}_1 \ldots \hat{y}_M
: \{1,\ldots,M\} \to S_Y$ uniformly at random and independently. Finally we let $x_i = \hat{x}_i$ and $y_i = \hat{y}_i$ for each
$1 \le i \le n$.
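The sampling procedure for $\mathcal{L}_n(g,N,M)$ can be sketched in the same way as for the collision problem. The code below is an illustration with hypothetical function names; it assumes $(g,N,M)$ is a super-quasilattice point and takes $\kappa(g) = 4g^2 - 12g + 9$.

```python
import random


def kappa(g):
    """kappa(g) = 4g^2 - 12g + 9 = (2g - 3)^2, so kappa(1) = kappa(2) = 1."""
    return 4 * g * g - 12 * g + 9


def sample_set_comparison(n, g, N, M, rng):
    """Draw (X, Y) from L_n(g, N, M), assuming a super-quasilattice point."""
    k = kappa(g)
    if N % g != 0 or M % k != 0 or M < n:
        raise ValueError("need g | N, kappa(g) | M and M >= n")
    size_S = 2 * N // g
    size_sub = M // k
    if size_S > 2 * n or size_sub > size_S:
        raise ValueError("need 2N/g <= 2n and M/kappa(g) <= |S|")
    S = rng.sample(range(1, 2 * n + 1), size_S)  # common pool S

    def draw_sequence():
        sub = rng.sample(S, size_sub)            # S_X or S_Y, chosen inside S
        seq = [s for s in sub for _ in range(k)] # each value kappa(g) times
        rng.shuffle(seq)                         # uniform kappa(g)-to-1 function
        return seq[:n]                           # keep the first n entries

    return draw_sequence(), draw_sequence()
```

With $g = 2$ and $N = M = n$ the two sequences are permutations of the same set $S$, while with $g = 1$ they are one-to-one sequences over independently chosen halves of $\{1,\ldots,2n\}$.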

Define sets $X_S = \{x_1, \ldots, x_n\}$ and $Y_S = \{y_1, \ldots, y_n\}$. Suppose $g = 1$ and $N = M = n$; then by
Chernoff bounds,

$$\Pr_{(X,Y) \in \mathcal{L}_n(1,n,n)} \left[ |X_S \cup Y_S| < 1.1n \right] \le 2e^{-n/10}.$$

Thus, if algorithm $A$ can distinguish $|X_S \cup Y_S| = n$ from $|X_S \cup Y_S| \ge 1.1n$ with probability at least 9/10, then
it can distinguish $(X,Y) \in \mathcal{L}_n(1,n,n)$ from $(X,Y) \in \mathcal{L}_n(2,n,n)$ with probability at least $9/10 - 2e^{-n/10}$.
So a lower bound for the latter problem implies an equivalent lower bound for the former.
Define $P(X,Y)$ to be the probability that the algorithm returns that $X$ and $Y$ are far on input $(X,Y)$,
and let

$$P(g,N,M) = \operatorname{EX}_{(X,Y) \in \mathcal{L}_n(g,N,M)}\, P(X,Y).$$

We then have

Lemma 4 For all sufficiently large $n$ and if $T \le n^{1/3}/8$, there exists a trivariate polynomial $q(g,N,M)$ of
degree at most $8T$ such that if $(g,N,M)$ is a super-quasilattice point, then

$$|P(g,N,M) - q(g,N,M)| < \varepsilon$$

for some constant $0 < \varepsilon < 1/2$.

Proof. By analogy to Lemma 1, $P(X,Y)$ is a multilinear polynomial of degree at most $2T$ over variables
of the form $\Delta(x_i, h)$ and $\Delta(y_i, h)$. Let $I(X,Y) = I_X(X)\, I_Y(Y)$ where $I_X$ is a product of $r_X(I)$ distinct
$\Delta(x_i, h)$ variables and $I_Y$ is a product of $r_Y(I)$ distinct $\Delta(y_i, h)$ variables. Let $r(I) = r_X(I) + r_Y(I)$.
Define

$$\gamma(I,g,N,M) = \operatorname{EX}_{(X,Y) \in \mathcal{L}_n(g,N,M)}\, I(X,Y);$$

then

$$P(g,N,M) = \sum_{I : r(I) \le 2T} \beta_I\, \gamma(I,g,N,M)$$

for some coefficients $\beta_I$.

We now calculate $\gamma(I,g,N,M)$. As before we assume there are no pairs of variables $\Delta(x_i, h_1), \Delta(x_i, h_2) \in I$
with $h_1 \ne h_2$.

Let $Z_X(I)$ be the range of $I_X$ and let $Z_Y(I)$ be the range of $I_Y$. Then let $Z(I) = Z_X(I) \cup Z_Y(I)$. Let
$w_X(I) = |Z_X(I)|$, $w_Y(I) = |Z_Y(I)|$, and $w(I) = |Z(I)|$. By assumption

$$\frac{N}{g} \ge \frac{M}{\kappa(g)} \ge \frac{1}{4} n^{1/3} \ge 2T$$

so

$$\Pr[Z(I) \subseteq S] = \frac{\binom{2n - w(I)}{2N/g - w(I)}}{\binom{2n}{2N/g}}.$$

Then the probability that $Z_X(I) \subseteq S_X$ given $Z(I) \subseteq S$ is

$$\frac{\binom{2N/g - w_X(I)}{M/\kappa(g) - w_X(I)}}{\binom{2N/g}{M/\kappa(g)}}$$

and similarly for the probability that $Z_Y(I) \subseteq S_Y$ given $Z(I) \subseteq S$.

Let $r_{X,1}(I), \ldots, r_{X,w_X(I)}(I)$ be the multiplicities of the range elements in $Z_X(I)$, so that $r_{X,1}(I) +
\cdots + r_{X,w_X(I)}(I) = r_X(I)$. Then

$$\Pr[I_X(X) \mid Z_X(I) \subseteq S_X] = \frac{(M - r_X(I))!}{M!} \prod_{i=1}^{w_X(I)} \prod_{j=0}^{r_{X,i}(I)-1} (\kappa(g) - j)$$

and similarly for $\Pr[I_Y(Y) \mid Z_Y(I) \subseteq S_Y]$.
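Putting it all together,

$$\begin{aligned}
\gamma(I,g,N,M) &= \frac{\binom{2n - w(I)}{2N/g - w(I)}}{\binom{2n}{2N/g}} \cdot \frac{\binom{2N/g - w_X(I)}{M/\kappa(g) - w_X(I)}}{\binom{2N/g}{M/\kappa(g)}} \cdot \frac{\binom{2N/g - w_Y(I)}{M/\kappa(g) - w_Y(I)}}{\binom{2N/g}{M/\kappa(g)}} \\
&\quad \times \frac{(M - r_X(I))!}{M!} \prod_{i=1}^{w_X(I)} \prod_{j=0}^{r_{X,i}(I)-1} (\kappa(g) - j) \times \frac{(M - r_Y(I))!}{M!} \prod_{i=1}^{w_Y(I)} \prod_{j=0}^{r_{Y,i}(I)-1} (\kappa(g) - j) \\
&= \frac{(2n - w(I))!}{(2n)!} \cdot \frac{(M - r_X(I))!}{M!} \cdot \frac{(M - r_Y(I))!}{M!} \cdot \frac{(2N/g - w_X(I))!\,(2N/g - w_Y(I))!}{(2N/g - w(I))!\,(2N/g)!}\, \phi_I(g,M)
\end{aligned}$$

where

$$\phi_I(g,M) = \prod_{i=0}^{w_X(I)-1} (M - i\kappa(g)) \prod_{i=1}^{w_X(I)} \prod_{j=1}^{r_{X,i}(I)-1} (\kappa(g) - j) \prod_{i=0}^{w_Y(I)-1} (M - i\kappa(g)) \prod_{i=1}^{w_Y(I)} \prod_{j=1}^{r_{Y,i}(I)-1} (\kappa(g) - j)$$

is a bivariate polynomial in $(g,M)$ of total degree at most $2r(I)$.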
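Thus

$$\begin{aligned}
\gamma(I,g,N,M) &= \frac{(2n - w(I))!}{(2n)!} \left[ \frac{(M - 2T)!\, n!}{M!\, (n - 2T)!} \right]^2 \left[ \frac{(n - 2T)!}{n!} \right]^2 \phi_I(g,M) \prod_{i=r_X(I)}^{2T-1} (M - i) \prod_{i=r_Y(I)}^{2T-1} (M - i) \\
&\quad \times \frac{(2N/g - w_X(I)) \cdots (2N/g - (w(I) - 1))}{(2N/g)\,(2N/g - 1) \cdots (2N/g - (w_Y(I) - 1))} \\
&= \frac{(2n)^{2T}}{(2N)\,(2N - g) \cdots (2N - (2T - 1)g)} \left[ \frac{(M - 2T)!\, n!}{M!\, (n - 2T)!} \right]^2 \widetilde{q}_{n,T,I}(g,N,M)
\end{aligned}$$

where

$$\begin{aligned}
\widetilde{q}_{n,T,I}(g,N,M) &= \frac{(2n - w(I))!}{(2n)!\,(2n)^{2T}} \left[ \frac{(n - 2T)!}{n!} \right]^2 \phi_I(g,M) \prod_{i=r_X(I)}^{2T-1} (M - i) \prod_{i=r_Y(I)}^{2T-1} (M - i) \\
&\quad \times g^{\,w_X(I) + w_Y(I) - w(I)} \prod_{i=w_X(I)}^{w(I)-1} (2N - gi) \prod_{i=w_Y(I)}^{2T-1} (2N - gi)
\end{aligned}$$

is a trivariate polynomial in $(g,N,M)$ of total degree at most

$$(4T - r(I)) + 2r(I) + (w_X(I) + w_Y(I) - w(I)) + (w(I) - w_X(I)) + (2T - w_Y(I)) \le 8T.$$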

Thus

$$P(g,N,M) = \frac{(2n)^{2T}}{\prod_{i=0}^{2T-1} (2N - gi)} \left[ \frac{(M - 2T)!\, n!}{M!\, (n - 2T)!} \right]^2 q(g,N,M)$$

where $q(g,N,M)$ is a polynomial of total degree at most $8T$. The argument that $q$ approximates $P$ to
within a constant is analogous to that of Lemma 2; note that

$$\frac{(2n)^{2T}}{\prod_{i=0}^{2T-1} (2N - gi)} \le \left( 1 + \frac{gT}{n - gT} \right)^{2T} = O(1)$$

since $g \le n^{1/3}$ and $T \le n^{1/3}/8$. ■
Theorem 5 $Q_2(\mathrm{SetComp}_n) = \Omega(n^{1/7})$.

Proof sketch. The proof is analogous to that of Theorem 3. Let $g \in [1,G]$ for some $G \le n^{1/3}$. Then
the super-quasilattice points $(g,N,M)$ all lie in $R = [1,G] \times [n, n + n/(100T)]^2$. Define $d(q)$ to be

$$\max_{(g,N,M) \in R} \left( \max \left\{ \left| \frac{\partial q}{\partial g} \right|,\ \frac{n/100T}{(G-1)} \left| \frac{\partial q}{\partial N} \right|,\ \frac{n/100T}{(G-1)} \left| \frac{\partial q}{\partial M} \right| \right\} \right).$$

Then $d(q) \ge \delta$ for some constant $\delta > 0$, by Lemma 4.

For every point $(\hat{g}, \hat{N}, \hat{M}) \in R$, there exists a super-quasilattice point $(g,N,M)$ such that $|g - \hat{g}| \le 1$,
$|N - \hat{N}| \le G$, and $|M - \hat{M}| \le \kappa(G)$. Hence, $q(\hat{g}, \hat{N}, \hat{M})$ can deviate from $[0,1]$ by at most

$$O\left( \left( \frac{TG^3}{n} + 1 \right) d(q) \right).$$

Let $(g^*, N^*, M^*)$ be a point in $R$ at which $d(q)$ is attained. Suppose $d(q)$ is attained in the $g$ direction;
the cases of the $N$ and $M$ directions are analogous. Then $q(g, N^*, M^*)$ is a univariate polynomial in $g$, and

$$8T \ge \deg(q(g, N^*, M^*)) = \Omega\left( \sqrt{ \frac{d(q)\, G}{1 + d(q) + d(q)\, TG^3/n} } \right) = \Omega\left( \min\left\{ \sqrt{G},\ \sqrt{\frac{n}{TG^2}} \right\} \right).$$

One can show that the bound is optimized when we take $G = n^{2/7} \le n^{1/3}$. Then

$$T = \Omega(n^{1/7}).$$